TITLE: Interaction trap systems for detecting protein interactions



Abstract Text (1) : 

Disclosed herein is a method of determining whether a first protein is capable of 
physically interacting with a second protein, involving: (a) providing a host cell 
which contains (i) a reporter gene operably linked to a protein binding site; (ii) 
a first fusion gene which expresses a first fusion protein, the first fusion 
protein including the first protein covalently bonded to a binding moiety which is 
capable of specifically binding to the protein binding site; and (iii) a second 
fusion gene which expresses a second fusion protein, the second fusion protein 
including the second protein covalently bonded to a gene activating moiety and 
being conformationally-constrained; and (b) measuring expression of the reporter
gene as a measure of an interaction between the first and the second proteins. Also 
disclosed are methods for assaying protein interactions, and identifying 
antagonists and agonists of protein interactions. 

INVENTOR (1) : 
Brent ; Roger 

Brief Summary Text (5) : 

Accordingly, in one aspect, the invention features a method of determining whether 
a first protein is capable of physically interacting with a second protein. The 
method includes (a) providing a host cell which contains (i) a reporter gene 
operably linked to a DNA-binding-protein recognition site; (ii) a first fusion gene 
which expresses a first fusion protein, the first fusion protein comprising the 
first protein covalently bonded to a binding moiety which is capable of 
specifically binding to the DNA-binding-protein recognition site; and (iii) a 
second fusion gene which expresses a second fusion protein, the second fusion 
protein including the second protein covalently bonded to a gene activating moiety 
and being conformationally-constrained; and (b) measuring expression of the
reporter gene as a measure of an interaction between the first and said second
proteins.

Brief Summary Text (6) : 

Preferably, the second protein is a short peptide of at least 6 amino acids in 
length and is less than or equal to 60 amino acids in length; includes a randomly 
generated or intentionally designed peptide sequence; includes one or more loops; 
or is conformationally-constrained as a result of covalent bonding to a
conformation-constraining protein, e.g., thioredoxin or a thioredoxin-like
molecule. Where the second protein is covalently bonded to a conformationally
constraining protein, the invention features a polypeptide wherein the second
protein is embedded within the conformation-constraining protein to which it is
covalently bonded. Where the conformation-constraining protein is thioredoxin, the
invention also features an additional method which includes a second protein which
is conformationally-constrained by disulfide bonds between cysteine residues in the
amino-terminus and in the carboxy-terminus of the second protein. 

Brief Summary Text (7) :

In another aspect, the invention features a method of detecting an interacting
protein in a population of proteins, comprising: (a) providing a host cell which 
contains (i) a reporter gene operably linked to a DNA-binding-protein recognition 
site; and (ii) a fusion gene which expresses a fusion protein, the fusion protein 
including a test protein covalently bonded to a binding moiety which is capable of 
specifically binding to the DNA-binding-protein recognition site; (b) introducing 
into the host cell a second fusion gene which expresses a second fusion protein, 
the second fusion protein including one of said population of proteins covalently 
bonded to a gene activating moiety and being conformationally-constrained; and (c)
measuring expression of the reporter gene. Preferably, the population of proteins 
includes short peptides of between 1 and 60 amino acids in length. 

Brief Summary Text (8) : 

The invention also features a method of detecting an interacting protein within a 
population wherein the population of proteins is a set of randomly generated or 
intentionally designed peptide sequences, or where the population of proteins is 
conformationally-constrained by covalently bonding to a conformation-constraining
protein. Preferably, where the population of proteins is conformationally-constrained
by covalent bonding to a conformation-constraining protein, the
population of proteins is embedded within the conformation-constraining protein.
The invention further features a method of detecting an interacting protein within
a population wherein the conformation-constraining protein is thioredoxin.
Preferably, the population of proteins is inserted into the active site loop of the
thioredoxin.

Brief Summary Text (9) : 

The invention further features a method wherein each of the population of proteins 
is conformationally-constrained by disulfide bonds between cysteine residues in the 
amino-terminus and in the carboxy-terminus of said protein. 

Brief Summary Text (12) : 

In another related aspect, the invention features a method of identifying a 
candidate interactor. The method includes (a) providing a reporter gene operably 
linked to a DNA-binding-protein recognition site; (b) providing a first fusion 
protein, which includes a first protein covalently bonded to a binding moiety which 
is capable of specifically binding to the DNA-binding-protein recognition site; (c) 
providing a second fusion protein, which includes a second protein covalently 
bonded to a gene activating moiety and being conformationally-constrained, the 
second protein being capable of interacting with said first protein; (d) contacting 
said candidate interactor with said first protein and/or said second protein; and 
(e) measuring expression of said reporter gene. 

Brief Summary Text (15) : 

In a preferred embodiment, the candidate interactor is conformationally-constrained 
and may include one or more loops. Where the candidate interactor is an antagonist, 
reporter gene expression is reduced. Where the candidate interactor is an agonist, 
reporter gene expression is increased. The candidate interactor is a member 
selected from the group consisting of proteins, polynucleotides, and small
molecules. In addition, a candidate interactor can be encoded by a member of a cDNA
or synthetic DNA library. Moreover, the candidate interactor can be a mutated form 
of said first fusion protein or said second fusion protein. 

Brief Summary Text (16) : 

In a preferred embodiment of any of the above aspects, the candidate interactor is
isolated in vitro and shown to function in vivo, i.e., as a conformationally
constrained intracellular peptide.

Brief Summary Text (17): 

In a related aspect, the invention features a population of eukaryotic cells, each 
cell having a recombinant DNA molecule encoding a conformationally-constrained
intracellular peptide, there being at least 100 different recombinant molecules in
the population, each molecule being in at least one cell of said population. 

Brief Summary Text (18) : 

Preferably, the intracellular peptides within the population of cells are 
conformationally-constrained because they are covalently bonded to a
conformation-constraining protein.

Brief Summary Text (19) : 

In preferred embodiments the intracellular peptide is embedded within the 
conformation-constraining protein, preferably thioredoxin; the intracellular 
peptide is conformationally-constrained by disulfide bonds between cysteine
residues in the amino-terminus and in the carboxy-terminus of said second protein; 
the intracellular peptide includes one or more loops; the population of eukaryotic 
cells are yeast cells; the recombinant DNA molecule further encodes a gene 
activating moiety covalently bonded to said intracellular peptide; and/or the 
intracellular peptide physically interacts with a second recombinant protein inside 
said eukaryotic cells. 

Brief Summary Text (20) : 

In another aspect, the invention features a method of assaying an interaction 
between a first protein and a second protein. The method includes: (a) providing a 
reporter gene operably linked to a DNA-binding-protein recognition site; (b)
providing a first fusion protein including a first protein covalently bonded to a
binding moiety which is capable of specifically binding to the DNA-binding-protein
recognition site; (c) providing a second fusion protein including a second protein
which is conformationally constrained (and may include one or more loops) and is
covalently bonded to a gene activating moiety; (d) combining the reporter gene, the 
first fusion protein, and the second fusion protein; and (e) measuring expression 
of the reporter gene. 

Brief Summary Text (21) : 

In a preferred embodiment, the invention further features a method of assaying the 
interaction between two proteins wherein the first fusion protein is provided by 
providing a first fusion gene which expresses the first fusion protein and wherein 
the second fusion protein is provided by providing a second fusion gene which 
expresses the second fusion protein. In another preferred embodiment, the 
interaction is assayed in vitro and shown to function in vivo, i.e., as a 
conformationally constrained intracellular peptide.

Brief Summary Text (22) : 

In yet other aspects, the invention features a protein including the sequence
Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe
(SEQ ID NO: 1), preferably conformationally-constrained; a protein including the
sequence
Met-Val-Val-Ala-Ala-Glu-Ala-Val-Arg-Thr-Val-Leu-Leu-Ala-Asp-Gly-Gly-Asp-Val-Thr
(SEQ ID NO: 2), preferably conformationally-constrained; a protein including the
sequence
Pro-Asn-Trp-Pro-His-Gln-Leu-Arg-Val-Gly-Arg-Val-Leu-Trp-Glu-Arg-Leu-Ser-Phe-Glu
(SEQ ID NO: 3), preferably conformationally-constrained; a protein including the
sequence
Ser-Val-Arg-Met-Arg-Tyr-Gly-Ile-Asp-Ala-Phe-Phe-Asp-Leu-Gly-Gly-Leu-Leu-His-Gly
(SEQ ID NO: 9), preferably conformationally-constrained; a protein including the
sequence
Glu-Leu-Arg-His-Arg-Leu-Gly-Arg-Ala-Leu-Ser-Glu-Asp-Met-Val-Arg-Gly-Leu-Ala-Trp-Gly-
Pro-Thr-Ser-His-Cys-Ala-Thr-Val-Pro-Gly-Thr-Ser-Asp-Leu-Trp-Arg-Val-Ile-Arg-Phe-Leu
(SEQ ID NO: 10), preferably conformationally-constrained; a protein including the
sequence
Tyr-Ser-Phe-Val-His-His-Gly-Phe-Phe-Asn-Phe-Arg-Val-Ser-Trp-Arg-Glu-Met-Leu-Ala
(SEQ ID NO: 11), preferably conformationally-constrained; a protein including the
sequence
Gln-Val-Trp-Ser-Leu-Trp-Ala-Leu-Gly-Trp-Arg-Trp-Leu-Arg-Arg-Tyr-Gly-Trp-Asn-Met
(SEQ ID NO: 12), preferably conformationally-constrained; a protein including the
sequence
Trp-Arg-Arg-Met-Glu-Leu-Asp-Ala-Glu-Ile-Arg-Trp-Val-Lys-Pro-Ile-Ser-Pro-Leu-Glu
(SEQ ID NO: 13), preferably conformationally-constrained; a protein including the
sequence
Trp-Ala-Glu-Trp-Cys-Gly-Pro-Val-Cys-Ala-His-Gly-Ser-Arg-Ser-Leu-Thr-Leu-Leu-Thr-
Lys-Tyr-His-Val-Ser-Phe-Leu-Gly-Pro-Cys-Lys-Met-Ile-Ala-Pro-Ile-Leu-Asp
(SEQ ID NO: 17), preferably conformationally-constrained; a protein including the
sequence
Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe
(SEQ ID NO: 18), preferably conformationally-constrained; a protein including the
sequence
Tyr-Arg-Trp-Gln-Gln-Gly-Val-Val-Pro-Ser-Asn-Trp-Ala-Ser-Cys-Ser-Phe-Arg-Cys-Gly
(SEQ ID NO: 19), preferably conformationally-constrained; a protein including the
sequence
Ser-Ser-Phe-Ser-Leu-Trp-Leu-Leu-Met-Val-Lys-Ser-Ile-Lys-Arg-Ala-Ala-Trp-Glu-Leu-Gly-
Pro-Ser-Ser-Ala-Trp-Asn-Thr-Ser-Gly-Trp-Ala-Ser-Leu-Ala-Asp-Phe-Tyr
(SEQ ID NO: 20), preferably conformationally-constrained; a protein including the
sequence
Arg-Val-Lys-Leu-Gly-Tyr-Ser-Phe-Trp-Ala-Gln-Ser-Leu-Leu-Arg-Cys-Ile-Ser-Val-Gly
(SEQ ID NO: 21), preferably conformationally-constrained; a protein including the
sequence
Gln-Leu-Tyr-Ala-Gly-Cys-Tyr-Leu-Gly-Val-Val-Ile-Ala-Ser-Ser-Leu-Ser-Ile-Arg-Val
(SEQ ID NO: 22), preferably conformationally-constrained; a protein including the
sequence
Gln-Gln-Arg-Phe-Val-Phe-Ser-Pro-Ser-Trp-Phe-Thr-Cys-Ala-Gly-Thr-Ser-Asp-Phe-Trp-Gly-
Pro-Glu-Pro-Leu-Phe-Asp-Trp-Thr-Arg-Asp
(SEQ ID NO: 23), preferably conformationally-constrained; a protein including the
sequence
Arg-Pro-Leu-Thr-Gly-Arg-Trp-Val-Val-Trp-Gly-Arg-Arg-His-Glu-Glu-Cys-Gly-Leu-Thr
(SEQ ID NO: 24), preferably conformationally-constrained; a protein including the
sequence
Pro-Val-Cys-Cys-Met-Met-Tyr-Gly-His-Arg-Thr-Ala-Pro-His-Ser-Val-Phe-Asn-Val-Asp
(SEQ ID NO: 25), preferably conformationally-constrained; a protein including the
sequence
Trp-Ser-Pro-Glu-Leu-Leu-Arg-Ala-Met-Val-Ala-Phe-Arg-Trp-Leu-Leu-Glu-Arg-Arg-Pro
(SEQ ID NO: 26); and substantially pure DNA encoding the immediately
foregoing proteins. 
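
For readers who want to handle these sequences computationally, the following is a minimal sketch (not part of the disclosure) that converts the three-letter notation used above into one-letter codes. The function name `to_one_letter` is hypothetical; the mapping is the standard 20-residue code. SEQ ID NO: 1 is transcribed here as the example.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert 'Leu-Val-Cys' style notation to 'LVC'.

    Raises KeyError on an unrecognized residue code, which also helps
    catch transcription damage such as 'Va1' in place of 'Val'.
    """
    residues = [r.strip() for r in three_letter_seq.split("-") if r.strip()]
    return "".join(THREE_TO_ONE[r] for r in residues)


# SEQ ID NO: 1 as listed in the text (illustrative transcription).
SEQ_ID_NO_1 = (
    "Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-"
    "Leu-Phe-Arg-Ser-Leu-Phe"
)

if __name__ == "__main__":
    print(to_one_letter(SEQ_ID_NO_1))
```

Because an unknown code raises an error rather than being skipped, the same helper can serve as a quick consistency check on any of the listed sequences.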

Brief Summary Text (23) : 

The invention also includes novel proteins and other candidate interactors 
identified by the foregoing methods. It will be appreciated that these proteins and 
candidate interactors may either increase or decrease reporter gene activity and 
that these changes in activity may be measured using assays described herein or 
known in the art. Also included in the invention are methods for using 
conformationally constrained interactor proteins. For example, the conformationally
constrained proteins of the invention may be used as reagents in assays for protein
detection that involve formation of a complex between the conformationally
constrained protein and a protein of interest to which it specifically binds,
followed by complex detection (for example, by an immunoprecipitation, Western
blot, or affinity column technique that utilizes the conformationally constrained
protein as the complex-forming reagent).

Brief Summary Text (24): 

Finally, the invention features a method of assaying an interaction between a first 
protein and a second protein, involving: (a) providing the first protein; (b) 
providing a fusion protein including the second protein, the second protein being 
conformationally-constrained; (c) contacting the first protein with the fusion
protein under conditions which allow complex formation; (d) detecting the complex 
as an indication of an interaction; and (e) determining whether the first protein 
interacts with the fusion protein inside a cell. 

Brief Summary Text (26) : 

By "operably linked" is meant that a gene and a regulatory sequence(s) are
connected in such a way as to permit gene expression when the appropriate molecules
(e.g., transcriptional activator proteins or proteins which include transcriptional
activation domains) are bound to the regulatory sequence(s).

Brief Summary Text (27): 

By "covalently bonded" is meant that two domains are joined by covalent bonds, 
directly or indirectly. That is, the "covalently bonded" proteins or protein 
moieties may be immediately contiguous or may be separated by stretches of one or 
more amino acids within the same fusion protein.

Brief Summary Text (30) :

By a "binding moiety" is meant a stretch of amino acids which is capable of 
directing specific polypeptide binding to a particular DNA sequence (i.e., a
"DNA-binding-protein recognition site").

Brief Summary Text (33) : 

By "conformationally-constrained" is meant a protein that has reduced structural
flexibility because its amino and carboxy termini are fixed in space. As a result
of this constraint, the protein may form "loops" (i.e., regions of amino acids of
any shape which extend away from the constrained amino and carboxy termini).
Preferably, the conformationally-constrained protein is displayed in a structurally
rigid manner. Conformational constraint according to the invention may be brought
about by exploiting the disulfide-bonding ability of a natural or
recombinantly-introduced pair of cysteine residues, one residing at or near the amino-terminal
end of the protein of interest and the other at or near the carboxy-terminal end. 
Alternatively, conformational constraint may be facilitated by embedding the 
protein of interest within a conformation-constraining protein. 

Brief Summary Text (34): 

By "conformation-constraining protein" is meant any peptide or polypeptide which is 
capable of reducing the flexibility of another protein's amino and/or carboxy 
termini. Preferably, such proteins provide a rigid scaffold or platform for the 
protein of interest. In addition, such proteins preferably are capable of providing 
protection from proteolytic degradation and the like, and/or are capable of 
enhancing solubility. Examples of conformation-constraining proteins include 
thioredoxin and other thioredoxin-like proteins, nucleases (e.g., RNase A),
proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin 
inhibitor), antibodies or structurally-rigid fragments thereof, conotoxins, and the 
pleckstrin homology domain. A conformation-constraining peptide can be of any 
appropriate length and can even be a single amino acid residue. 

Brief Summary Text (35) :

"Thioredoxin-like proteins" are defined herein as amino acid sequences
substantially similar, e.g., having at least 18% homology, with the amino acid
sequence of E. coli thioredoxin over an amino acid sequence length of 80 amino
acids. Alternatively, a thioredoxin-like DNA sequence is defined herein as a DNA
sequence encoding a protein or fragment of a protein characterized by having a
three dimensional structure substantially similar to that of human or E. coli
thioredoxin, e.g., glutaredoxin, and optionally by containing an active-site loop.
The DNA sequence of glutaredoxin is an example of a thioredoxin-like DNA sequence
which encodes a protein that exhibits such substantial similarity in
three-dimensional conformation and contains a Cys...Cys active-site loop. The amino
acid sequence of E. coli thioredoxin is described in Eklund et al., EMBO J.
3:1443-1449 (1984). The three-dimensional structure of E. coli thioredoxin is depicted in
FIG. 2 of Holmgren, J. Biol. Chem. 264:13963-13966 (1989). A DNA sequence encoding
the E. coli thioredoxin protein is set forth in Lim et al., J. Bacteriol.,
163:311-316 (1985). The three dimensional structure of human thioredoxin is described in
Forman-Kay et al., Biochemistry 30:2685-98 (1991). A comparison of the three
dimensional structures of E. coli thioredoxin and glutaredoxin is published in Xia,
Protein Science 1:310-321 (1992). These four publications are incorporated herein
by reference for the purpose of providing information on thioredoxin-like proteins
that is known to one of skill in the art. Examples of thioredoxin-like proteins are
described herein. 
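
The sequence-based criterion above (at least 18% homology over 80 amino acids) can be illustrated with a small sketch. This is only an approximation under stated assumptions: "homology" is modeled here as simple percent identity, the two sequences are assumed to be already aligned to equal length with `-` as the gap character, and the function names and toy inputs are hypothetical rather than taken from the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned, non-gap positions of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_similarity_threshold(
    aligned_a: str,
    aligned_b: str,
    min_length: int = 80,      # window length stated in the text
    threshold: float = 18.0,   # percent figure stated in the text
) -> bool:
    """True if identity >= threshold over at least min_length compared residues."""
    compared = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"
    )
    return (
        compared >= min_length
        and percent_identity(aligned_a, aligned_b) >= threshold
    )
```

A real comparison would start from a proper pairwise alignment of the candidate against E. coli thioredoxin; the sketch only shows how the two numeric thresholds combine.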

Brief Summary Text (37) : 

"Compounds" include small molecules, generally under 1000 MW, carbohydrates, 
polynucleotides, lipids, and the like. 

Brief Summary Text (41) : 

By "intracellular" is meant that the peptide is localized inside the cell, rather
than on the cell surface.

Brief Summary Text (45) :

In addition, the claimed methods make use of conformationally-constrained proteins
(i.e., proteins with reduced flexibility due to constraints at their amino and
carboxy termini). Conformational constraint may be brought about by embedding the
protein of interest within a conformation-constraining protein (i.e., a protein of
appropriate length and amino acid composition to be capable of locking the
candidate interacting protein into a particular three-dimensional structure).
Examples of conformation-constraining proteins include, but are not limited to,
thioredoxin (or other thioredoxin-like proteins), nucleases (e.g., RNase A),
proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin 
inhibitor), antibodies or structurally-rigid fragments thereof, conotoxins, and the 
pleckstrin homology domain. 

Brief Summary Text (46): 

Alternatively, conformational constraint may be accomplished by exploiting the 
disulfide-bonding ability of a natural or recombinantly-introduced pair of cysteine
residues, one residing at the amino terminus of the protein of interest and the
other at its carboxy terminus. Such disulfide bonding locks the protein into a
rigid and therefore conformationally-constrained loop structure. Disulfide bonds
between amino-terminal and carboxy-terminal cysteines may be formed, for example,
in the cytoplasm of E. coli trxB mutant strains. Under some conditions disulfide
bonds may also form within the cytoplasm and nucleus of higher organisms harboring
equivalent mutations, for example, an S. cerevisiae YTR4⁻ mutant strain
(Furter et al., Nucl Acids Res. 14:6357-6373, 1986; GenBank Accession Number
P29509). In addition, the thioredoxin fusions described herein (trxA fusions) are
amenable to this alternative means of introducing conformational constraint, since 
the cysteines at the base of peptides inserted within the thioredoxin active-site 
loop are at a proper distance from one another to form disulfide bonds under 
appropriate conditions. 
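
As a rough illustration of the sequence prerequisite for this disulfide-based constraint, the sketch below checks whether a peptide (in one-letter code) carries a cysteine at, or within a few residues of, each terminus. The function name, the `window` parameter, and the toy peptides are hypothetical; the check says nothing about whether a disulfide bond actually forms, which depends on the cellular conditions described above.

```python
def has_terminal_cysteine_pair(peptide: str, window: int = 1) -> bool:
    """True if a Cys ('C') lies within `window` residues of each terminus.

    window=1 requires cysteines exactly at the amino and carboxy termini;
    a larger window models cysteines "at or near" the termini.
    """
    # Require non-overlapping terminal windows so two distinct residues
    # are needed to satisfy the check.
    if window < 1 or len(peptide) < 2 * window:
        return False
    seq = peptide.upper()
    return "C" in seq[:window] and "C" in seq[-window:]


if __name__ == "__main__":
    # Toy peptides, not sequences from the disclosure.
    for pep in ("CGSWRAC", "GSWRAC", "GCSWRCA"):
        print(pep, has_terminal_cysteine_pair(pep), has_terminal_cysteine_pair(pep, 2))
```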

Brief Summary Text (47): 

Conformationally-constrained proteins as candidate interactors are useful in the
invention because they are amenable to tertiary structural analysis, thus
facilitating the design of simple organic molecule mimetics with improved
pharmacological properties. For example, because thioredoxin has a known structure,
the protein structure between the conformationally constrained regions may be more
easily solved using methods such as NMR and X-ray difference analysis. Certain 
conformation-constraining proteins also protect the embedded protein from cellular 
degradation and/or increase the protein's solubility, and/or otherwise alter the 
capacity of the candidate interactor to interact. 

Brief Summary Text (48): 

Once isolated, interacting proteins can also be analyzed using the interaction trap 
system, with the signal generated by the interaction being an indication of any 
change in the proteins' interaction capabilities. In one particular example, an
alteration is made (e.g., by standard in vivo or in vitro directed or random
mutagenesis procedures) to one or both of the interacting proteins, and the effect
of the alteration(s) is monitored by measuring reporter gene expression. Using this
technique, interacting proteins with increased or decreased interaction potential 
are isolated. Such proteins are useful as therapeutic molecules (for example, 
agonists or antagonists) or, as described above, as models for the design of simple 
organic molecule mimetics. 

Brief Summary Text (49): 

Protein agonists and antagonists may also be readily identified and isolated using 
a variation of the interaction trap system. In particular, once a protein-protein 
interaction has been recorded, an additional DNA coding for a candidate agonist or 
antagonist, or preferably, one of a library of potential agonist- or
antagonist-encoding sequences is introduced into the host cell, and reporter gene expression
is measured. Alternatively, candidate interactor agonist or antagonist compounds 
(i.e., including polypeptides as well as non-proteinaceous compounds, e.g., single 
stranded polynucleotides) are introduced into an in vivo or in vitro interaction 
trap system according to the invention and their ability to affect reporter gene
expression is measured. A decrease in reporter gene expression (compared to a 
control lacking the candidate sequence or compound) indicates an antagonist. 
Conversely, an increase in reporter gene expression (compared again to a control) 
indicates an agonist. Interaction agonists and antagonists are useful as 
therapeutic agents or as models to design simple mimetics; if desired, an agonist 
or antagonist protein may be conformationally-constrained to provide the advantages
described herein. Particular examples of interacting proteins for which antagonists
or agonists may be identified include, but are not limited to, the IL-6
receptor-ligand pair, TGF-β receptor-ligand pair, IL-1 receptor-ligand pair and other
receptor-ligand interactions, protein kinase-substrate pairs, interacting pairs of
transcription factors, interacting components of signal transduction pathways (for
example, cytoplasmic domains of certain receptors and G-proteins), pairs of
interacting proteins involved in cell cycle regulation (for example, p16 and CDK4),
and neurotransmitter pairs. 
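
The decision rule described in this paragraph (a decrease in reporter expression relative to a control indicates an antagonist, an increase indicates an agonist) can be sketched as follows. The function name, the units, and especially the `tolerance` band used to ignore small fluctuations are assumptions added for illustration; the text itself specifies no numeric cutoff.

```python
def classify_candidate(
    with_candidate: float,
    control: float,
    tolerance: float = 0.1,  # hypothetical noise band: +/-10% of control
) -> str:
    """Classify a candidate from reporter expression relative to a control.

    `control` is reporter expression without the candidate sequence or
    compound; both values are in the same arbitrary units.
    """
    if control <= 0:
        raise ValueError("control expression must be positive")
    ratio = with_candidate / control
    if ratio < 1.0 - tolerance:
        return "antagonist"   # reporter expression decreased
    if ratio > 1.0 + tolerance:
        return "agonist"      # reporter expression increased
    return "no detectable effect"


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    for reading in (20.0, 105.0, 250.0):
        print(reading, classify_candidate(reading, control=100.0))
```

In practice the cutoff would be chosen from replicate measurements of the control rather than fixed in advance.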

Brief Summary Text (50) : 

Also included in the present invention are libraries encoding
conformationally-constrained proteins. Such libraries (which may include natural as well as
synthetic DNA sequence collections) are expressed intracellularly or, optionally, 
in cell-free systems, and may be used together with any standard genetic selection 
or screen or with any of a number of interaction trap formats for the 
identification of interacting proteins, agonist or antagonist proteins, or proteins 
that endow a cell with any identifiable characteristic, for example, proteins that 
perturb cell cycle progression. Accordingly, peptide-encoding libraries (either 
random or designed) can be used in selections or screens which either are or are 
not transcriptionally-based. These libraries (which preferably include at least 100 
different peptide-encoding species and more preferably include 1000, or 100,000 or 
greater individual species) may be transformed into any useful prokaryotic or 
eukaryotic host, with yeast representing the preferred host. Alternatively, such 
peptide-encoding libraries may be expressed in cell-free systems. 
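
To make the library-size figures concrete, here is a small sketch that builds a random peptide library in silico and guarantees a requested number of distinct species. It works at the peptide level for simplicity, whereas the libraries described above are DNA collections encoding such peptides; the function name, default length, and seed are hypothetical choices, and the 60-residue upper bound is taken from the peptide lengths mentioned earlier in the text.

```python
import random
from typing import Set

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def random_peptide_library(n_species: int, length: int = 20, seed: int = 0) -> Set[str]:
    """Return a set of `n_species` distinct random peptides of a fixed length."""
    if not 1 <= length <= 60:
        raise ValueError("length must be between 1 and 60 residues")
    if n_species > len(AMINO_ACIDS) ** length:
        raise ValueError("requested more species than distinct sequences exist")
    rng = random.Random(seed)  # seeded for reproducibility
    library: Set[str] = set()
    while len(library) < n_species:
        # Duplicates are absorbed by the set, so the loop continues
        # until the requested number of distinct species is reached.
        library.add("".join(rng.choice(AMINO_ACIDS) for _ in range(length)))
    return library


if __name__ == "__main__":
    lib = random_peptide_library(100)
    print(len(lib), "distinct peptide species")
```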

Detailed Description Text (2) : 

Applicants have developed a novel interaction trap system for the identification 
and analysis of conformationally-constrained proteins that either physically
interact with a second protein of interest or that antagonize or agonize such an 
interaction. In one embodiment, the system involves a eukaryotic host strain (e.g., 
a yeast strain) which is engineered to produce a protein of therapeutic or 
diagnostic interest as a fusion protein covalently bonded to a known DNA binding 
domain; this protein is referred to as a "bait" protein because its purpose in the 
system is to "catch" useful, but as yet unknown or uncharacterized, interacting 
polypeptides (termed the "prey"; see below). The eukaryotic host strain also
contains one or more "reporter genes," i.e., genes whose transcription is detected 
in response to a bait-prey interaction. Bait proteins, via their DNA binding 
domain, bind to their specific DNA recognition site upstream of a reporter gene; 
reporter transcription is not stimulated, however, because the bait protein lacks 
an activation domain. 

Detailed Description Text (3) : 

To isolate DNA sequences encoding novel interacting proteins, members of a DNA 
expression library (e.g., a cDNA or synthetic DNA library, either random or 
intentionally biased) are introduced into the strain containing the reporter gene 
and bait protein; each member of the library directs the synthesis of a candidate 
interacting protein fused to an invariant gene activation domain tag. Those 
library-encoded proteins that physically interact with the promoter-bound bait 
protein are referred to as "prey" proteins. Such bound prey proteins (via their
activation domain tag) detectably activate the expression of the downstream
reporter gene and provide a ready assay for identifying a particular DNA clone
encoding an interacting protein of interest. In the instant invention, each
candidate prey protein is conformationally-constrained (for example, either by
embedding the protein within a conformation-constraining protein or by linking 
together the protein's amino and carboxy termini). Such a protein is maintained in 
a fixed, three-dimensional structure, facilitating mimetic drug design. 

Detailed Description Text (4) : 

An example of one interaction trap system according to the invention is shown in 
FIGS. 1A-C. FIG. 1A shows a leucine auxotroph yeast strain containing two reporter
genes, LexAop-LEU2 and LexAop-lacZ, and a constitutively expressed bait protein
gene. The bait protein (shown as a pentagon) is fused to a DNA binding domain
(shown as a circle). The DNA binding protein recognizes and binds a specific
DNA-binding-protein recognition site (shown as a solid rectangle) operably-linked to a
reporter gene. In FIGS. 1B and 1C, the cells additionally contain candidate prey
proteins (candidate interactors) (shown as an empty rectangle in 1B and an empty
hexagon in 1C) fused to an activation domain (shown as a solid square); each prey
protein is embedded in a conformation-constraining protein (shown as two solid half
circles). FIG. 1B shows that if the candidate prey protein does not interact with
the transcriptionally-inert LexA-fusion bait protein, the reporter genes are not
transcribed; the cell cannot grow into a colony on leu⁻ medium, and it is
white on Xgal medium because it contains no β-galactosidase activity. FIG. 1C
shows that, if the candidate prey protein interacts with the bait, both reporter
genes are active; the cell forms a colony on leu⁻ medium, and cells in that
colony have β-galactosidase activity and are blue on Xgal medium. Preferably,
in this system, the bait protein (i.e., the protein containing a site-specific DNA 
binding domain) is transcriptionally inert, and the reporter genes (which are bound 
by the bait protein) have essentially no basal transcription. 
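
The readout logic of FIGS. 1B and 1C can be summarized in a short sketch. The class and function names are hypothetical. The `bait_self_activates` flag is an added assumption, not something shown in the figures: it illustrates why a transcriptionally inert bait is preferred, since a bait that activated the reporters on its own would give the positive phenotype without any prey interaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    grows_without_leucine: bool  # LexAop-LEU2 reporter readout
    colony_color_on_xgal: str    # LexAop-lacZ reporter readout


def expected_phenotype(prey_binds_bait: bool, bait_self_activates: bool = False) -> Phenotype:
    """Predict the two-reporter phenotype of a cell in the interaction trap."""
    # Both reporters sit downstream of LexA operators, so they respond together
    # whenever an activation domain is brought to the promoter.
    reporters_on = prey_binds_bait or bait_self_activates
    return Phenotype(
        grows_without_leucine=reporters_on,
        colony_color_on_xgal="blue" if reporters_on else "white",
    )


if __name__ == "__main__":
    print("no interaction:", expected_phenotype(False))
    print("interaction:   ", expected_phenotype(True))
```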

Detailed Description Text (8) : 

Preferably, the bait protein also includes a LexA dimerization domain; this 
optional domain facilitates efficient LexA dimer formation. Because LexA binds its 
DNA binding site as a dimer, inclusion of this domain in the bait protein also 
optimizes the efficiency of operator occupancy (Golemis and Brent, Mol. Cell Biol.
12:3006-3014 (1992)).

Detailed Description Text (16) : 

In the selection described herein, another DNA construction is utilized which 
encodes a series of candidate interacting proteins (i.e., prey proteins); each is 
conformationally-constrained, either by being embedded in a
conformation-constraining protein or because the prey protein's amino and carboxy termini are
linked (e.g., by disulfide bonding). An exemplary prey protein includes an 
invariant N-terminal moiety carrying, amino to carboxy terminal, an ATG for protein 
expression, an optional nuclear localization sequence, a weak activation domain 
(e.g., the B112 or B42 activation domains of Ma and Ptashne; Cell 51:113, 1987), 
and an optional epitope tag for rapid immunological detection of fusion protein 
synthesis. Library sequences, random or intentionally designed synthetic DNA 
sequences, or sequences encoding conformationally-constrained proteins, may be
inserted downstream of this N-terminal fragment to produce fusion genes encoding 
prey proteins. 

Detailed Description Text (18) : 

Similarly, any number of activation domains may be used for that portion of the 
prey molecule; such activation domains are preferably weak activation domains,
i.e., weaker than the GAL4 activation region II moiety and preferably no stronger
than B112 (as measured, e.g., by a comparison with GAL4 activation region II or
B112 in parallel β-galactosidase assays using lacZ reporter genes); such a
domain may, however, be weaker than B112. In particular, the extraordinary
sensitivity of the LEU2 selection scheme allows even extremely weak activation
domains to be utilized in the invention. Examples of other useful weak activation
domains include B17, B42, and the amphipathic helix (AH) domains described in Ma
and Ptashne (Cell 51:113, 1987), Ruden et al. (Nature 350:426-430, 1991), and
Giniger and Ptashne (Nature 330:670, 1987). 

Detailed Description Text (19) : 

The prey proteins, if desired, may include other optional nuclear localization 
sequences (e.g., those derived from the GAL4 or MATα2 genes) or other 
optional epitope tags (e.g., portions of the c-myc protein or the flag epitope 
available from Immunex). These sequences optimize the efficiency of the system, but 
are not required for its operation. In particular, the nuclear localization 
sequence optimizes the efficiency with which prey molecules reach the 
nuclear-localized reporter gene construct(s), thus increasing their effective concentration
and allowing one to detect weaker protein interactions. The epitope tag merely 
facilitates a simple immunoassay for fusion protein expression. 

Detailed Description Text (22) : 

According to one embodiment of the present invention, the DNA sequence encoding the 
prey protein is embedded in a DNA sequence encoding a conformation-constraining 
protein (i.e., a protein that decreases the flexibility of the amino and carboxy 
termini of the prey protein). Methods for directly linking the amino and carboxy
termini of a protein (e.g., through disulfide bonding of appropriately positioned 
cysteine residues) are described above. As an alternative to this approach, 
conformation-constraining proteins may be utilized. In general, conformation- 
constraining proteins act as scaffolds or platforms, which limit the number of 
possible three dimensional configurations the peptide or protein of interest is 
free to adopt. Preferred examples of conformation-constraining proteins are 
thioredoxin or other thioredoxin-like sequences, but many other proteins are also
useful for this purpose. Preferably, conformation-constraining proteins are small 
in size (generally, less than or equal to 200 amino acids), rigid in structure, of 
known three dimensional configuration, and are able to accommodate insertions of 
proteins of interest without undue disruption of their structures. A key feature of 
such proteins is the availability, on their solvent exposed surfaces, of locations 
where peptide insertions can be made (e.g., the thioredoxin active-site loop). It 
is also preferable that conformation-constraining protein producing genes be highly 
expressible in various prokaryotic and eukaryotic hosts, or in suitable cell-free 
systems, and that the proteins be soluble and resistant to protease degradation. 
Examples of conformation-constraining proteins useful in the invention include 
nucleases (e.g., RNase A), proteases (e.g., trypsin), protease inhibitors (e.g., 
bovine pancreatic trypsin inhibitor), antibodies or rigid fragments thereof,
conotoxins, and the pleckstrin homology domain. This list, however, is not 
limiting. It is expected that other conformation-constraining proteins having 
sequences not identified above, or perhaps not yet identified or published, may be 
useful based upon their structural stability and rigidity. 

Detailed Description Text (23) : 

As mentioned above, one preferred conformation-constraining protein according to 
the invention is thioredoxin or another thioredoxin-like protein. As one example of 
a thioredoxin-like protein useful in this invention, E. coli thioredoxin has the 
following characteristics. E. coli thioredoxin is a small protein, only 11.7 kD, 
and can be produced to high levels. The small size and capacity for high level 
synthesis of the protein contribute to a high intracellular concentration. E. coli
thioredoxin is further characterized by a very stable, tight tertiary structure 
which can facilitate protein purification. 

Detailed Description Text (24): 

The three dimensional structure of E. coli thioredoxin is known and contains 
several surface loops, including a distinctive Cys...Cys active-site loop 
between residues Cys33 and Cys36 which protrudes from the body of the 
protein. This Cys...Cys active-site loop is an identifiable, accessible surface 
loop region and is not involved in interactions with the rest of the protein which 
contribute to overall structural stability. It is therefore a good candidate as a 
site for prey protein insertions. Human thioredoxin, glutaredoxin, and other 
thioredoxin-like molecules also contain this Cys...Cys active-site loop. Both 
the amino- and carboxyl-termini of E. coli thioredoxin are on the surface of the 
protein and are also readily accessible for fusion construction. E. coli 
thioredoxin is also stable to proteases, stable in heat up to 80° C., and 
stable to low pH.

Detailed Description Text (25) : 

Other thioredoxin-like proteins encoded by thioredoxin-like DNA sequences useful in 
this invention share homologous amino acid sequences, and similar physical and 
structural characteristics. Thus, DNA sequences encoding other thioredoxin-like 
proteins may be used in place of E. coli thioredoxin according to this invention. 
For example, DNA sequences encoding other species' thioredoxins, e.g., human 
thioredoxin, are suitable. Human thioredoxin has a three-dimensional structure that 
is virtually superimposable on that of E. coli thioredoxin, as determined 
by comparing the NMR structures of the two molecules (Forman-Kay et al., Biochem. 
30:2685 (1991)). Human thioredoxin also contains an active-site loop structurally 
and functionally equivalent to the Cys...Cys active-site loop found in the E. 
coli protein. It can be used in place of or in addition to E. coli thioredoxin in 
the production of proteins and small peptides in accordance with the method of this 
invention. Insertions into the human thioredoxin active-site loop and onto the 
amino terminus may be as well-tolerated as those in E. coli thioredoxin.

Detailed Description Text (26) : 

Other thioredoxin-like sequences which may be employed in this invention include 
all or portions of the proteins glutaredoxin and various species' homologs thereof 
(Holmgren, supra). Although E. coli glutaredoxin and E. coli thioredoxin share less 
than 20% amino acid homology, the two proteins do have conformational and 
functional similarities (Eklund et al., EMBO J. 3:1443-1449 (1984)) and 
glutaredoxin contains an active-site loop structurally and functionally equivalent 
to the Cys...Cys active-site loop of E. coli thioredoxin. Glutaredoxin is 
therefore a thioredoxin-like molecule as defined herein.

Detailed Description Text (27): 

In addition, the DNA sequence encoding protein disulfide isomerase (PDI), or that 
portion containing the thioredoxin-like domain, and its various species' homologs 
(Edman et al., Nature 317:267-270 (1985)) may also be employed as a 
thioredoxin-like DNA sequence, since a repeated domain of PDI shares >30% homology 
with E. coli thioredoxin and that repeated domain contains an active-site loop 
structurally and functionally equivalent to the Cys...Cys active-site loop of 
E. coli thioredoxin. The two latter publications are incorporated herein by 
reference for the purpose of providing information on glutaredoxin and PDI which is 
known and available to one of skill in the art.

Detailed Description Text (28): 

Similarly the DNA sequence encoding phosphoinositide-specific phospholipase C 
(PI-PLC), fragments thereof, and various species' homologs thereof (Bennett et al., 
Nature 334:268-270 (1988)) may also be employed in the present invention as a 
thioredoxin-like sequence based on the amino acid sequence homology with E. coli 
thioredoxin, or alternatively based on similarity in three dimensional conformation 
and the presence of an active-site loop structurally and functionally equivalent to 
the Cys...Cys active-site loop of E. coli thioredoxin. All or a portion of the DNA 
sequence encoding an endoplasmic reticulum protein, ERp72, or various species' 
homologs thereof are also included as thioredoxin-like DNA sequences for the 
purposes of this invention (Mazzarella et al., J. Biol. Chem. 265:1094-1101 (1990)) 
based on amino acid sequence homology, or alternatively based on similarity in 
three dimensional conformation and the presence of an active-site loop structurally 
and functionally equivalent to the Cys...Cys active-site loop of E. coli 
thioredoxin. Another thioredoxin-like sequence is a DNA sequence which encodes all 
or a portion of an adult T-cell leukemia-derived factor (ADF) or other species' 
homologs thereof (Wakasugi et al., Proc. Natl. Acad. Sci. USA 87:8282-8286 
(1990)). ADF is now believed to be human thioredoxin. Similarly, the protein 
responsible for promoting disulfide bond formation in the periplasm of E. coli, the 
product of the dsbA gene (Bardwell et al., Cell 67:581-89, 1991), also can be 
considered a thioredoxin-like sequence. The three latter publications are 
incorporated herein by reference for the purpose of providing information on 
PI-PLC, ERp72, ADF, and dsbA which are known and available to one of skill in the art.



Detailed Description Text (29) : 

It is expected from the definition of thioredoxin-like sequences used above that 
other sequences not specifically identified above, or perhaps not yet identified or 
published, may be useful as thioredoxin-like sequences based on their amino acid 
sequence homology to E. coli thioredoxin or based on having three dimensional 
structures substantially similar to E. coli or human thioredoxin and having an 
active-site loop functionally and structurally equivalent to the Cys...Cys 
active-site loop of E. coli thioredoxin. One skilled in the art can determine 
whether a molecule has these latter two characteristics by comparing its 
three-dimensional structure, as analyzed for example by x-ray crystallography or 
two-dimensional NMR spectroscopy, with the published three-dimensional structure for 
E. coli thioredoxin and by analyzing the amino acid sequence of the molecule to 
determine whether it contains an active-site loop that is structurally and 
functionally equivalent to the Cys...Cys active-site loop of E. coli 
thioredoxin. By "substantially similar" in three-dimensional structure or 
conformation is meant as similar to E. coli thioredoxin as is glutaredoxin. In 
addition, a predictive algorithm has been described which enables the identification 
of thioredoxin-like proteins via computer-assisted analysis of primary sequence 
(Ellis et al., Biochemistry 31:4882-91 (1992)). Based on the above description, one 
of skill in the art will be able to select and identify, or, if desired, modify, a 
thioredoxin-like DNA sequence for use in this invention without resort to undue 
experimentation. For example, simple point mutations made to portions of native 
thioredoxin or native thioredoxin-like sequences which do not affect the structure 
of the resulting molecule are alternative thioredoxin-like sequences, as are 
allelic variants of native thioredoxin or native thioredoxin-like sequences.

Detailed Description Text (30) : 

DNA sequences which hybridize to the sequence for E. coli thioredoxin or its 
structural homologs under either stringent or relaxed hybridization conditions also 
encode thioredoxin-like proteins for use in this invention. An example of one such 
stringent hybridization condition is hybridization at 4× SSC at 65° C., 
followed by a washing in 0.1× SSC at 65° C. for an hour. Alternatively, an 
exemplary stringent hybridization condition is in 50% formamide, 4× SSC at 
42° C. Examples of non-stringent hybridization conditions are 4× SSC at 
50° C. or hybridization with 30-40% formamide at 42° C. The use of 
all such thioredoxin-like sequences is believed to be encompassed in this
invention. 

Detailed Description Text (31) : 

It may be preferred for a variety of reasons that prey proteins be fused within the 
active-site loop of thioredoxin or thioredoxin-like molecules. The face of
thioredoxin surrounding the active-site loop has evolved, in keeping with the 
protein's major function as a nonspecific protein disulfide oxido-reductase, to be 
able to interact with a wide variety of protein surfaces. The active-site loop 
region is found between segments of strong secondary structure and this provides a 
rigid platform to which one may tether prey proteins. 

Detailed Description Text (32): 

A small prey protein inserted into the active-site loop of a thioredoxin-like 
protein is present in a region of the protein which is not involved in maintaining 
tertiary structure. Therefore the structure of such a fusion protein is stable. 
Indeed, E. coli thioredoxin can be cleaved into two fragments at a position close 
to the active-site loop, and yet the tertiary interactions stabilizing the protein 
remain.

Detailed Description Text (33) : 

The active-site loop of E. coli thioredoxin has the sequence 
NH2...Cys33-Gly-Pro-Cys36...COOH. Fusing a selected prey protein with a 
thioredoxin-like protein in the active loop portion of the protein constrains the
prey at both ends, reducing the degrees of conformational freedom of the prey 
protein, and consequently reducing the number of alternative structures taken by 
the prey. The inserted prey protein is bound at each end by cysteine residues, 
which may form a disulfide linkage to each other as they do in native thioredoxin 
and further limit the conformational freedom of the inserted prey. 

Detailed Description Text (34): 

In addition, by being positioned within the active-site loop, the prey protein is 
placed on the surface of the thioredoxin-like protein, an advantage for use in 
screening for bioactive protein conformations and other assays. In general, the 
utility of thioredoxin or other thioredoxin-like proteins is described in McCoy et 
al., U.S. Pat. No. 5,270,181 and LaVallie et al., Bio/Technology 11:187-193 (1993).
These two references are hereby incorporated by reference. 

Detailed Description Text (35) : 

There now follows a description of thioredoxin interaction trap systems according 
to the invention. These examples are designed to illustrate, not limit, the 
invention. 

Detailed Description Text (36) : 

Thioredoxin Interaction Trap System

Detailed Description Text (37) : 

Interaction trap systems utilizing conformationally-constrained proteins have been
developed for the detection of protein interactions, the identification and 
isolation of proteins participating in such interactions, the identification and 
isolation of agonists and antagonists of such interactions, and the identification 
and isolation of interacting peptide aptamers that may be used in protein detection 
assays in a manner analogous to antibody-type reagents. Exemplary systems are now 
described. 

Detailed Description Text (38) : 

1. Thioredoxin Interaction Trap with Cdk2 Bait 

Detailed Description Text (39) : 

Progression of eukaryotic cells through the cell cycle requires the coordinated 
action of a number of regulatory proteins that interact with and regulate the 
activity of Cdks (Sherr, Cell 79:551-555 (1994)). These modulatory proteins include 
cyclins, which positively regulate Cdk activity, cyclin-dependent kinase inhibitors 
(Ckis), and a number of protein kinases and phosphatases, some of which, such as 
CAK and Cdc25, positively regulate kinase activity, some of which, such as Wee1, 
inhibit kinase activity, and some of which, such as Cdi1 (Gyuris et al., Cell 
75:791-803 (1993)), have effects that are so far unknown (reviewed in Morgan, 
Nature 374:131-134 (1995)). Cdk2 is thought to be required for higher eukaryotic 
cells to progress from G1 into S-phase (Fang & Newport, J. Cell Biol. 66:731-742 
(1991); Pagano et al., J. Cell Biol. 121:101-111 (1993); van den Heuvel & Harlow, 
Science 262:2050-2054 (1993)). Cdk2 kinase activity is positively regulated by 
Cyclin E and Cyclin A (Koff et al., Science 257:1689-1694 (1992); Dulic et al., 
Science 257:1958-1961 (1992); Tsai et al., Nature 353:174-7 (1991)) and negatively 
regulated by p21, p27 and p57 (Harper et al., Cell 75:805-816 (1993); Polyak et 
al., Genes Dev. 8:9-22 (1994); Toyoshima & Hunter, Cell 78:67-74 (1994); Matsuoka 
et al., Genes Dev. 9:660-662 (1995); Lee et al., Genes Dev. 9:639-649 (1995)); in 
addition, Cdk2 complexes with Cdi1 at the G1 to S transition (Gyuris et al., 
supra). Here we describe the use of a yeast two-hybrid system to select molecules
which recognize Cdk2 from combinatorial libraries. 

Detailed Description Text (40) : 

A prey vector is constructed containing the E. coli thioredoxin gene (trxA). pJG4-4 
(Gyuris et al., supra) is used as the vector backbone and cut with EcoRI and 
XhoI. A DNA fragment encoding the B112 transcription activation domain is obtained 
by PCR amplification of plasmid LexA-B112 (Doug Ruden, Ph.D. thesis, Harvard 
University, 1992) and cut with MunI and NdeI. The E. coli trxA gene is excised from 
the vector pALTRXA-781 (U.S. Pat. No. 5,292,646; InVitrogen Corp., San Diego, 
Calif.) by digestion with NdeI and SalI. The trxA and B112 fragments are then 
ligated by standard techniques into the EcoRI/XhoI-cut pJG4-4 backbone, forming 
pYENAeTRX. This vector encodes a fusion protein comprising the SV40 nuclear 
localization domain, the B112 transcription activation domain, a hemagglutinin 
epitope tag, and E. coli thioredoxin (FIG. 2).

Detailed Description Text (41): 

Peptide libraries are constructed as follows. The DNA oligomer 5' GACTGACTGGTCCG 
(NNK)20 GGTCCTCAGTCAGTCAG 3' (with N=A, C, G, T and K=G, T) (SEQ ID NO: 4) is 
synthesized and annealed to the second oligomer (5' CTGACTGACTGAGGACC 3') (SEQ ID 
NO: 5) in order to form double stranded DNA at the 3' end of the first oligomer. 
The second strand is enzymatically completed using Klenow enzyme, priming synthesis 
with the second oligomer. The product is cleaved with AvaII, and inserted into 
RsrII cut pYENAeTRX. After ligation, the construct is used to transform E. coli by 
standard methods (Ausubel et al., Current Protocols in Molecular Biology, (Greene 
and Wiley-Interscience, New York, 1987-1994)). The library contained 
2.9×10^9 members, of which more than 10^9 directed the synthesis of 
peptides. Twenty-mers were chosen as preferred peptides because they were long 
enough to fold into many different patterns of shape and charge and short enough 
that many of the encoding oligonucleotides lacked stop codons. Because of the 
presence of fortuitous restriction sites in some coding oligonucleotides and 
because some library members contained double inserts, approximately one fifth of 
the constrained peptides were longer or shorter than unit length.
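
The library design described in this paragraph can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes the standard genetic code (stop codons TAA, TAG, TGA) and independent, equiprobable NNK codons, and it models only in-frame stop codons, not double inserts or fortuitous restriction sites. The function and constant names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def nnk_codons() -> list[str]:
    """Enumerate all NNK codons: N = A/C/G/T at positions 1-2, K = G/T at position 3."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def fraction_without_stop(n_codons: int) -> float:
    """Probability that n independent, equiprobable NNK codons contain no stop codon."""
    codons = nnk_codons()
    sense = sum(1 for c in codons if c not in STOP_CODONS)
    return (sense / len(codons)) ** n_codons


# Sequences and library size as given in the text.
THREE_PRIME_ARM = "GGTCCTCAGTCAGTCAG"   # 3' constant arm of the first oligomer
SECOND_OLIGO = "CTGACTGACTGAGGACC"      # priming (second) oligomer
LIBRARY_SIZE = 2.9e9

if __name__ == "__main__":
    # The second oligomer should anneal to the 3' arm of the first oligomer.
    print("primer anneals to 3' arm:",
          reverse_complement(THREE_PRIME_ARM) == SECOND_OLIGO)
    p_open = fraction_without_stop(20)
    print(f"NNK codons: {len(nnk_codons())}")
    print(f"fraction of 20-codon inserts without a stop: {p_open:.3f}")
    print(f"expected stop-free members: {LIBRARY_SIZE * p_open:.2e}")
```

Under these assumptions, NNK permits 32 codons of which only TAG is a stop codon, so about 53% of 20-codon inserts are expected to be open, which is consistent with the statement that more than 10^9 of the 2.9×10^9 members directed the synthesis of peptides.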

Detailed Description Text (42) : 

To screen for interacting peptides or "aptamers," 100 μg of the library was used 
to transform the yeast strain EGY48 (Mata his3 leu2::2Lexop-LEU2 ura3 trp1 LYS2; 
Gyuris et al., supra). This strain also contained the reporter plasmid pSH18-34, a 
pLR1Δ1 derivative, containing the yeast 2μ replication origin, the URA3 
gene, and a GAL1-lacZ reporter gene with the GAL1 upstream regulatory elements 
replaced with 4 colE1 LexA operators (West et al., Mol. Cell Biol. 4:2467, 1984; 
Ebina et al., J. Biol. Chem. 258:13258, 1983; Hanes and Brent, Cell 57:1275, 1989), 
as well as the bait vector pLexA202-Cdk2 (Cdk2 encodes the human cyclin dependent 
kinase 2, an essential cell cycle enzyme) (Gyuris et al., supra; Tsai et al., 
Oncogene 8:1593, 1993). About 2.5×10^6 transformants were obtained and 
pooled. The first selection step, growth on leucine-deficient medium after 
induction with 2% galactose/1% raffinose (Gyuris et al., supra; Guthrie and Fink, 
Guide to Yeast Genetics and Molecular Biology, Vol. 194, 1991), was performed with 
an 8-fold redundancy (20×10^6 cfu) of the library in yeast, and about 900 
colonies were obtained after growth at 30° C. for 5 days. The 300 largest 
colonies were streak purified and tested for the galactose-dependent expression of 
the LEU2 gene product and of β-galactosidase (encoded by pSH18-34), the 
latter giving rise to blue yeast colonies in the presence of Xgal in the medium 
(Ausubel et al., supra). Thirty-three colonies fulfilled these requirements; 
sequencing showed that these included 14 different clones, all of which bound 
specifically to a LexA-Cdk2 bait but not to LexA or to a LexA-Cdk3 bait (Finley et 
al., Proc. Natl. Acad. Sci. USA 91:12980-12984 (1994)). The strength of binding was 
judged according to the intensity of the blue color formed by a colony of the yeast 
that contained each different interactor. By this means, each interactor was 
classified as a strong, medium, or weak binder, which was normalized to the amount 
of blue color caused by the various naturally-occurring partner proteins of Cdk2 in 
side by side
mating interaction assays. An example of the peptide sequence of one representative 
of each class is given here: 

Detailed Description Text (51) : 

In related experiments, 6 additional aptamers (i.e., pep6 (SEQ ID NO: 21), pep7 
(SEQ ID NO: 22), pep9 (SEQ ID NO: 23), pep12 (SEQ ID NO: 24), pep13 (SEQ ID NO: 
25), and pep14 (SEQ ID NO: 26)) were shown to interact with the LexA-Cdk2 bait but 
not with unrelated proteins such as Max or Rb, or with certain Cdk family members 
such as Cdk4, which shares 47% sequence identity with Cdk2 (FIG. 4A). However, some
aptamers interacted with other Cdk family members. The fact that different peptide 
aptamers showed distinct patterns of cross-reactivity with different Cdks indicated 
that these aptamers recognized different epitopes conserved among various Cdks. The 
sequence of the peptide loops is shown in FIG. 4B. Non-unit-length peptides 
occurred at the same frequency among the Cdk2 interacting aptamers as in the 
library as a whole. No aptamer showed significant sequence similarity to known 
proteins, as expected if the 20-mer peptides indeed formed novel recognition 
structures. All of the peptides were charged, suggesting that some of their 
interactions with the Cdk2 target could be ionic. 

Detailed Description Text (56) : 

To determine the binding affinities of these aptamers for Cdk2, the following 
experiments were carried out. Based on interpolation from interaction trap 
calibration experiments (Estojak et al., Mol. Cell. Biol. 15:5820-5829 (1995)), the 
robust transcription that some of the aptamers of FIGS. 4A and 4B directed from the 
pSH18-34 reporter suggested that the equilibrium dissociation constants (Kds) of 
the interactions were <10^-6 M. In order to precisely measure the binding
affinity of the aptamers to Cdk2, we used an evanescent wave instrument (BIAcore, 
Pharmacia, Piscataway, N.J.). Purified His6-Cdk2 was coupled to CM-dextran chips, 
and peptide aptamers flowed in running buffer over the chips. Following binding, 
the chips were rinsed with running buffer lacking aptamer. 

Detailed Description Text (59) :

The ability to select TrxA-peptides that interact specifically with designated 
intracellular baits allows for the creation of other classes of intracellular 
reagents. For example, appropriately derivatized TrxA-peptide fusions may allow the 
creation of antagonists or agonists (as described above). Alternatively, peptide 
fusions allow for the creation of homodimeric or heterodimeric "matchmakers," which 
force the interaction of particular protein pairs. In one particular example, two 
proteins are forced together by utilizing a leucine zipper sequence attached to a 
conformation-constraining protein containing a candidate interaction peptide. This
protein can bind to both members of a protein pair of interest and direct their 
interaction. Alternatively, the "matchmaker" may include two different sequences, 
one having affinity for a first polypeptide and the second having affinity for the 
second polypeptide; again, the result is directed interaction between the first and 
second polypeptides. Another practical application for the peptide fusions 
described herein is the creation of "destroyers," which target a bound protein for
destruction by host proteases. In an example of the destroyer application, a 
protease is fused to one component of an interacting pair and that component is 
allowed to interact with the target to be destroyed (e.g., a protease substrate). 
By this method, the protease is delivered to its desired site of action and its 
proteolytic potential effectively enhanced. Yet another application of the fusion 
proteins described herein is as "conformational stabilizers," which induce target
proteins to favor a particular conformation or stabilize that conformation. In one 
particular example, the ras protein has one conformation that signals a cell to 
divide and another conformation that signals a cell not to divide. By selecting a 
peptide or protein that stabilizes the desired conformation, one can influence 
whether a cell will divide. Other proteins that undergo conformational changes
which increase or decrease activity can also be bound to an appropriate 
"conformational stabilizer" to influence the property of the desired protein. 

Detailed Description Text (63) : 

Peptide 13 does not affect the growth of a cdc28-1Nts strain at high temperature
when the defect is complemented by a plasmid expressing wild-type Cdc28 product, 
and has no effect on yeast at the permissive temperature. While we do not intend to 
be bound by any particular theory, it appears that this peptide blocks yeast cell 
cycle progression by binding to some face of the Cdk2 molecule and inhibiting its 
function and thereby interfering with its ability to interact with cyclins, other 
partners, or with substrates. 

Detailed Description Text (64): 

In later experiments with the aptamers of FIG. 4B, inhibition of Cdk2 activity by 
these peptides (for example, by binding to a face of the molecule and by blocking 
its interaction with one of its partner proteins or substrates) was examined. In 
particular, the ability of the aptamers to inhibit phosphorylation of Histone H1 by 
Cdk2/Cyclin E kinase was tested. To carry out these experiments, 2×10^7 
Sf9 cells were co-infected with recombinant baculoviruses expressing 
hemagglutinin-tagged Cdk2 and His6-Cyclin E as described (Kato et al., Genes & Dev. 
7:331-342 (1993); Desai et al., Mol. Biol. Cell 3:571-582 (1992)). Cells were lysed 
40 hours after infection in 500 μl of 1× Kinase Buffer (Kato et al., 
supra), and 5 μl of a 100-fold diluted extract was used in 30 μl reactions. 
Reactions were carried out for 20 minutes at 25° C. by adding 2.5 μCi of 
[γ-32P]ATP (3000 Ci/mmol), 25 μM ATP, 100 ng of Histone H1 (Sigma, 
St. Louis, Mo.), and varying amounts of His6-TrxA or His6-aptamers. Samples were 
run on 15% SDS-PAGE gels and exposed by autoradiography.
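
For orientation, the ATP composition of the kinase reaction described above can be worked out from the stated quantities. The following Python sketch is illustrative only. It assumes that 25 μM is the final concentration of unlabeled ATP in the 30 μl reaction, that the labeled ATP is at its nominal specific activity (no radioactive decay), and that volumes are additive; the function name is hypothetical.

```python
def labeled_atp_molar(activity_uci: float,
                      specific_activity_ci_per_mmol: float,
                      volume_ul: float) -> float:
    """Molar concentration of radiolabeled ATP in a reaction.

    activity_uci: radioactivity added, in microcuries
    specific_activity_ci_per_mmol: nominal specific activity, in Ci/mmol
    volume_ul: reaction volume, in microliters
    """
    mmol = (activity_uci * 1e-6) / specific_activity_ci_per_mmol  # Ci / (Ci/mmol)
    moles = mmol * 1e-3
    litres = volume_ul * 1e-6
    return moles / litres


# Values taken from the text; COLD_ATP_M is assumed to be a final concentration.
HOT_ATP_M = labeled_atp_molar(2.5, 3000.0, 30.0)
COLD_ATP_M = 25e-6
HOT_FRACTION = HOT_ATP_M / (HOT_ATP_M + COLD_ATP_M)

if __name__ == "__main__":
    print(f"labeled ATP: {HOT_ATP_M * 1e9:.1f} nM")
    print(f"unlabeled ATP: {COLD_ATP_M * 1e6:.1f} uM")
    print(f"labeled fraction of total ATP: {HOT_FRACTION:.4f}")
```

Under these assumptions the labeled ATP contributes roughly 28 nM, about 0.1% of the total ATP, so the substrate concentration is set almost entirely by the unlabeled ATP and the label serves only as a tracer.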

Detailed Description Text (65) : 

The results of these experiments are shown in FIG. 8. All tested aptamers were able 
to inhibit phosphorylation of Histone H1 by Cdk2/Cyclin E kinase. Under standard 
conditions (pH 7.5, 0 mM NaCl) (Kato et al., supra), apparent half-inhibitory 
concentrations ranged from 1.5 to 100 nM. To rule out the possibility that a trace 
bacterial contaminant was responsible for the inhibition, we removed the 
His6-peptide aptamer from the Pep2 preparation with a rabbit polyclonal 
anti-thioredoxin antiserum; this immunodepleted preparation no longer inhibited 
Cdk2 kinase activity. Half-inhibitory concentrations of aptamers were lower than the Kds
measured from evanescent wave experiments, consistent with the idea that some of 
the energy of each interaction is ionic and is reduced by the salt in the 
evanescent wave instrument running buffer. 

Detailed Description Text (67): 

Previous studies have established that libraries of unconstrained peptides contain 
sequences capable of recognizing targets in vitro (Devlin et al., Science 249:404-406 
(1990); Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378-6382 (1990); Lam et
al., Nature 354:82-84 (1991); Songyang et al., Current Biology 4:973-982 (1994); 
Scott et al., Current Biology 5:40-48 (1994)) and in yeast (Yang et al., Nucl. 
Acids. Res. 23:1152-1156 (1995)); such isolated peptide sequences often bear 
similarity to natural interactors. By contrast, although constrained peptide 
libraries are less conformationally diverse (McConnell et al., Gene 151:115-118 
(1994)), the lack of conformational diversity should lower the entropic cost if 
binding causes the loop to adopt a single conformation (Spolar et al., Science 
263:777-784 (1994)); this reduction in entropic cost may account for the fact that 
our Cdk2 peptide aptamers recognize their targets with higher affinity than is 
typically observed for unconstrained peptides (Yang et al., supra; Oldenburg et
al., Proc. Natl. Acad. Sci. USA 89:5393-5397 (1992); McLafferty et al., Gene 
128:29-36 (1993)). This high affinity suggests that peptide aptamers may inhibit 
protein function in vivo, in the simplest case by binding to specific faces of the 
target molecule and disrupting its interaction with specific partners or effectors. 
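
The affinity argument above can be placed on an energy scale using the standard thermodynamic relation ΔG° = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The following Python sketch is illustrative only. It assumes a temperature of 25° C. (298.15 K); the Kd values used are round numbers chosen for illustration and are not measurements from this disclosure. The function names are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol: dG = R*T*ln(Kd / 1 M)."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_molar)


def free_energy_gain(kd_weak: float, kd_tight: float,
                     temp_k: float = 298.15) -> float:
    """Extra stabilization (kcal/mol, negative = tighter) going from kd_weak to kd_tight."""
    return binding_free_energy(kd_tight, temp_k) - binding_free_energy(kd_weak, temp_k)


if __name__ == "__main__":
    # Illustrative values only: a 1 uM binder versus a 100 nM binder.
    print(f"dG at Kd = 1e-6 M: {binding_free_energy(1e-6):.2f} kcal/mol")
    print(f"ddG for a 10-fold tighter Kd: {free_energy_gain(1e-6, 1e-7):.2f} kcal/mol")
```

On this scale each 10-fold improvement in Kd corresponds to about 1.4 kcal/mol at 25° C., which gives a sense of how large a reduction in the entropic cost of binding would be needed to account for a given gain in affinity.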

Detailed Description Text (68) :

The ability to generate large numbers of aptamers from combinatorial libraries, 
taken together with the interaction trap, which offers a powerful selection for 
those that bind specific proteins, facilitates the selection of peptide aptamers 
against a variety of intracellular targets. Aptamers which inhibit protein contacts 
can be used to aid the dissection of the networks of protein interactions that 
govern division of higher eukaryotic cells and can also be used for the genetic 
analysis of those metazoan organisms for which isolation of specific missense 
alleles may be impractical. The analogy of the aptamers of the invention with 
antibodies indicates that peptide aptamers can also be used in other applications 
in which immunological reagents are now employed, such as ELISAs, 
immunofluorescence experiments, and sensors. If desired, the affinity of these 
aptamers may be increased, for example, by increasing their valency and using 
existing interaction technology to select mutants that bind more tightly. This 
first generation of peptide aptamers facilitates the production of recognition 
modules for intracellular nanotechnologies aimed at destroying, modifying, and 
assembling macromolecules inside cells. 

Detailed Description Text (69) : 

3. Thioredoxin Interaction Trap with OncoRas Bait 

Detailed Description Text (73) :

Such mutationally-activated conformational changes in GTP-bound H-ras mutants 
provide targets for members of a conformationally constrained random peptide 
library. In the present example, the library is a conformationally constrained 
thioredoxin peptide library, as described above. Library members that interact 
with oncogenic Ras have been identified using a variation of the interaction trap
technology provided above. The oncogenic Ras peptide aptamers isolated may be 
assayed for their ability to disrupt the interaction of oncogenic Ras with known 
effectors and to inhibit cellular transformation. 

Detailed Description Text (79) : 

Aliquots (50 µl) of the cells were then incubated at 30 °C for 30 min 
with 1 µg of thioredoxin peptide library DNA, 70 µg of salmon sperm DNA, and 
300 µl of sterile 40% PEG 4000 in LiOAc/TE. The mixtures were heat-shocked at 
42 °C for 15 min. Each aliquot was plated onto a 24 cm × 24 cm plate 
containing glucose Ura- His- Trp- medium and was incubated at 30 °C for two 
days. The transformation efficiency typically ranged from 50,000 to 100,000 colony 
forming units per µg of library DNA. 
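
The arithmetic implied by these figures can be sketched as follows. This is a minimal illustration, not part of the disclosed protocol: the function names and the screening target of 1,000,000 transformants are hypothetical, while the 1 µg of library DNA per aliquot and the efficiency range of 50,000 to 100,000 colony forming units per µg are taken from the paragraph above.

```python
import math

# Figures taken from the text: 1 ug of library DNA per aliquot and
# 50,000 to 100,000 colony forming units (cfu) per ug of library DNA.
DNA_PER_ALIQUOT_UG = 1.0
EFFICIENCY_CFU_PER_UG = (50_000, 100_000)


def transformants_per_aliquot(efficiency_cfu_per_ug, dna_ug=DNA_PER_ALIQUOT_UG):
    """Expected number of yeast transformants obtained from one aliquot."""
    return efficiency_cfu_per_ug * dna_ug


def aliquots_needed(target_transformants, efficiency_cfu_per_ug,
                    dna_ug=DNA_PER_ALIQUOT_UG):
    """Aliquots (one plate each) required to reach a target number of transformants."""
    per_aliquot = transformants_per_aliquot(efficiency_cfu_per_ug, dna_ug)
    return math.ceil(target_transformants / per_aliquot)


if __name__ == "__main__":
    target = 1_000_000  # hypothetical screening target, not stated in the text
    low, high = EFFICIENCY_CFU_PER_UG
    print(aliquots_needed(target, high), "to", aliquots_needed(target, low), "aliquots")
```

Under these assumptions, each plate would be expected to carry between 50,000 and 100,000 transformants, so the number of plates scales inversely with the efficiency actually achieved.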

Detailed Description Text (81) : 

One µl of each sample was used to transform E. coli KC8 cells by 
electroporation. Bacterial transformants were selected on minimal agar supplemented 
with uracil, leucine, histidine, and ampicillin. Each type of transformant resulted 
in the final isolation of a plasmid which has a leucine marker and which carries a 
DNA fragment encoding a thioredoxin-peptide fusion protein. 

Detailed Description Text (86) : 

The protein interaction assays described herein can also be accomplished in a cell- 
free, in vitro system. Such a system may begin with a DNA construct including a 
reporter gene operably linked to a DNA-binding-protein recognition site (e.g., a 
LexA binding site). To this DNA is added a bait protein (e.g., any of the bait 
proteins described herein bound to a LexA DNA binding domain) and a prey protein 
(e.g., one of a library of conformationally-constrained candidate interactor prey 
proteins bound to a gene activation domain). Interaction between the bait and prey 
protein is assayed by measuring the reporter gene product, either as an RNA 
product, as an in vitro translated protein product, or by some enzymatic activity 
of the translated reporter gene product. Alternatively, interactions involving 
conformationally constrained proteins may be carried out by direct in vitro 
techniques, for example, by any standard physical or biochemical technique for 
identifying protein interactions (such as immobilization of a first protein on a 
column or other solid support and contact with a conformationally-constrained 
protein). These direct in vitro approaches are preferably carried out in such a way 
that the DNA encoding the conformationally-constrained protein may be readily 
isolated, for example, by using techniques involving phage display or display of 
the protein on the E. coli flagella. 
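
The readout logic of the reporter-based assay described above can be sketched as a toy model. This is an illustration under stated assumptions only: the `Fusion` class, the moiety labels, and the protein names are hypothetical, and real reporter output is quantitative rather than a simple on/off value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    """A protein fused to a functional moiety (all names are illustrative)."""
    protein: str
    moiety: str  # "dna_binding" for the bait, "activation" for the prey


def reporter_expressed(bait, prey, interacting_pairs):
    """Return True when the reporter gene would be expected to be transcribed.

    The reporter is on only if the bait carries the DNA-binding moiety, the
    prey carries the gene activation moiety, and the two proteins interact.
    """
    if bait.moiety != "dna_binding" or prey.moiety != "activation":
        return False
    return frozenset((bait.protein, prey.protein)) in interacting_pairs


if __name__ == "__main__":
    # Hypothetical interaction table: one bait and one interacting prey.
    pairs = {frozenset(("bait_X", "prey_1"))}
    bait = Fusion("bait_X", "dna_binding")
    for name in ("prey_1", "prey_2"):
        prey = Fusion(name, "activation")
        print(name, reporter_expressed(bait, prey, pairs))
```

The model captures only the design principle: the binding moiety tethers the bait at the recognition site, and the activation moiety reaches the reporter only through a bait-prey contact.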

Detailed Description Text (88) : 

In one particular embodiment, interacting proteins identified in vitro are tested 
for their ability to interact in vivo. Such in vivo interacting proteins may be 
used for any diagnostic or therapeutic purpose. For example, proteins shown to 
interact in vivo may be used to disrupt, encourage, or stabilize intracellular 
interactions or may be used as an intracellular antibody-type reagent. 

CLAIMS : 

2. A method of assaying an interaction between a first protein and a second 
protein, comprising: 

(a) providing said first protein; 

(b) providing a fusion protein comprising said second protein, said second protein 
having reduced structural flexibility due to disulfide bonding between cysteine 
residues at said second protein's amino- and carboxy-termini; 

(c) contacting said first protein with said fusion protein under conditions which 
allow complex formation; 

(d) detecting said complex; and 

(e) determining whether said first protein interacts with said fusion protein 
inside a cell by the steps of: 

(i) providing a host cell which contains (a) a reporter gene operably linked to a 
DNA-binding-protein recognition site; (b) said first protein covalently bonded to a 
DNA binding protein which specifically binds to said DNA-binding-protein 
recognition site; and (c) said second protein covalently bonded to a gene 
activating moiety; and 

(ii) measuring expression of said reporter gene as a measure of an interaction 
between said first protein and said second protein. 

4. A method of assaying an interaction between a first protein and a second 
protein, comprising: 

(a) providing said first protein; 

(b) providing a fusion protein comprising said second protein, said second protein 
having reduced structural flexibility due to covalent bonding of both the amino and 
carboxy termini of said second protein to a conformation-constraining protein; 

(c) contacting said first protein with said fusion protein under conditions which 
allow complex formation; 

(d) detecting said complex; and 

(e) determining whether said first protein interacts with said fusion protein 
inside a cell by the steps of: 
(i) providing a host cell which contains (a) a reporter gene operably linked to a 
DNA-binding-protein recognition site; (b) said first protein covalently bonded to a 
DNA binding protein which specifically binds to said DNA-binding-protein 
recognition site; and (c) said second protein covalently bonded to a gene 
activating moiety; and 

(ii) measuring expression of said reporter gene as a measure of an interaction 
between said first protein and said second protein. 

6. The method of claim 4, wherein said conformation-constraining protein is 
thioredoxin. 

7. The method of claim 4, wherein said conformation-constraining protein is a 
thioredoxin-like molecule, said thioredoxin-like molecule being characterized by 
having a three-dimensional structure substantially similar to that of human or E. 
coli thioredoxin and containing an active site loop. 

8. The method of claim 6, wherein said second protein is inserted into the active 
site loop of said thioredoxin. 

9. The method of claim 7, wherein said active site loop has a structure 
substantially similar to that of human or E. coli thioredoxin or glutaredoxin. 

10. A method for detecting a first protein in a sample, comprising 

(a) contacting said sample with a second protein having reduced structural 
flexibility due to covalent bonding of both the amino- and carboxy-termini of said 
second protein to a thioredoxin-like molecule, said thioredoxin-like molecule 
having a three-dimensional structure substantially similar to that of human or E. 
coli thioredoxin and containing an active site loop, said first protein or said 
second protein being an intracellular protein, said contacting being carried out 
under conditions which allow said second protein to specifically bind to said first 
protein and form a complex; and 

(b) detecting said complex. 

11. The method of claim 10, wherein said thioredoxin-like molecule is thioredoxin. 

12. The method of claim 11, wherein said second protein is inserted into the active 
site loop of said thioredoxin. 

13. The method of claim 10, wherein said active site loop has a structure 
substantially similar to that of human or E. coli thioredoxin or glutaredoxin. 

14. The method of claim 10, wherein said second protein is embedded within said 
thioredoxin-like protein. 

17. The method of claim 1, 2, or 4, wherein said first protein or said second 
protein is an intracellular protein. 